Group-Wise Shrinkage Estimation in Penalized Model-Based Clustering

Abstract

Finite Gaussian mixture models provide a powerful and widely employed probabilistic approach for clustering multivariate continuous data. However, the practical usefulness of these models is jeopardized in high-dimensional spaces, where they tend to be over-parameterized. As a consequence, different solutions have been proposed, often relying on matrix decompositions or variable selection strategies. Recently, a methodological link between Gaussian graphical models and finite mixtures has been established, paving the way for penalized model-based clustering in the presence of large precision matrices. Notwithstanding, current methodologies implicitly assume similar levels of sparsity across the classes, not accounting for different degrees of association between the variables across groups. We overcome this limitation by deriving group-wise penalty factors, which automatically enforce under- or over-connectivity in the estimated graphs. The approach is entirely data-driven and does not require additional hyper-parameter specification. Analyses on synthetic and real data showcase the validity of our proposal.
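
As a rough illustration of the kind of objective the abstract refers to, penalized model-based clustering with sparse precision matrices is commonly written as a penalized log-likelihood. The notation below (mixing proportions $\pi_k$, means $\mu_k$, precision matrices $\Omega_k$, Gaussian density $\phi$, shrinkage parameter $\lambda$, weight matrices $P_k$, element-wise product $\circ$) is assumed for this sketch and does not appear in the abstract. In particular, the matrices $P_k$ only stand in for the group-wise penalty factors; how they are actually derived is described in the paper itself.

```latex
\ell_P(\Theta) \;=\; \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \,
  \phi\!\left(x_i ;\, \mu_k,\, \Omega_k^{-1}\right)
  \;-\; \lambda \sum_{k=1}^{K} \left\lVert P_k \circ \Omega_k \right\rVert_1
```

If every entry of each $P_k$ equals one, all classes receive the same penalty, which matches the "similar levels of sparsity" assumption the abstract criticizes. Class-specific $P_k$ would let some estimated graphs be denser or sparser than others.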

Similar Resources

Penalized estimation of covariance matrices with flexible amounts of shrinkage

Penalized maximum likelihood estimation has been advocated for its capability to yield substantially improved estimates of covariance matrices, but so far only cases with equal numbers of records have been considered. We show that a generalization of the inverse Wishart distribution can be utilised to derive penalties which allow for differential penalization for different blocks of the matrice...

A Penalized Likelihood Estimation on Transcriptional Module-Based Clustering

In this paper, we propose a new clustering procedure for high-dimensional microarray data. A major difficulty in cluster analysis of microarray data is that the number of samples to be clustered is much smaller than the dimension of the data, which equals the number of genes used in the analysis. In such a case, the applicability of conventional model-based clustering is limited by the occurrence o...

Penalized model-based clustering with unconstrained covariance matrices.

Clustering is one of the most useful tools for high-dimensional analysis, e.g., for microarray data. It becomes challenging in the presence of a large number of noise variables, which may mask underlying clustering structures. Therefore, noise removal through variable selection is necessary. One effective way is regularization for simultaneous parameter estimation and variable selection in model-ba...

Penalized Model-Based Clustering with Application to Variable Selection

Variable selection in clustering analysis is both challenging and important. In the context of model-based clustering analysis with a common diagonal covariance matrix, which is especially suitable for “high dimension, low sample size” settings, we propose a penalized likelihood approach with an L1 penalty function, automatically realizing variable selection via thresholding and delivering a spa...

Rival Penalized Competitive Learning for Model-Based Sequence Clustering

In this paper, we propose a model-based, competitive learning procedure for the clustering of variable-length sequences. Hidden Markov models (HMMs) are used as representations for the cluster centers, and rival penalized competitive learning (RPCL), originally developed for domains with static, fixed-dimensional features, is extended. State merging operations are also incorporated to favor the...

Journal

Journal title: Journal of Classification

Year: 2022

ISSN: 0176-4268, 1432-1343

DOI: https://doi.org/10.1007/s00357-022-09421-z